THE HORIZONTAL ZERO IN FREQUENCY
DIAGRAMS.

By Earle Clark, Russell Sage Foundation. 



It is a generally accepted rule of graphic presentation that 
a zero, used in a diagram as a point of reference, should be 
included in the diagram. This rule, while it is observed in 
most statistical work, is almost universally disregarded in the 
drafting of frequency diagrams. 

Diagram 1, presented herewith, is a frequency graph of a
common type, based on the weights of 738 men.* Weights
are indicated on the base line, and the per cent of cases
corresponding to any given weight is proportionate to the
vertical distance from the base line to the curve. A zero line is
the most conspicuous feature of this diagram, but inspection
of the figure shows that the presentation implies two zeros,
and that only one of these is shown. The vertical scale,
representing percentages, begins at the zero base line, but the
horizontal scale, representing weights, begins at 90 pounds.
It is the purpose of this paper to state reasons for including
the horizontal zero, to direct attention to a type of frequency
diagram to which these reasons do not apply, and to illustrate
methods of drafting.

Diagram 1. Weights of 738 men, shown without horizontal zero.

*The data are for 738 men born in Wales, as shown in Yule's "Introduction to the Theory of Statistics,"
95. For convenience in presentation, the extremes of the distribution have been arbitrarily shortened.

A frequency diagram is plotted for the purpose of showing 
the significant facts about a series of variables. The graphic 
form is used rather than a frequency table or text statement 
because most people, even most statisticians, find it easier to 
perceive and appreciate these significant facts by looking at a 
diagram than by studying a column of figures. The essential 
facts about a variable series are: (1) the mean, median, or 
other measure of central tendency, and (2) the distribution of 
the values about this central tendency. These facts are inter- 
dependent. It is a simple matter to compute medians or 
means, but these measures do not reveal the whole truth about 
a distribution; they may be seriously misleading unless shown 
in relation to the distribution of the individual values. 

On the other hand, the distribution is not in itself signif- 
icant unless related to the central tendency. Stated in pounds 
and ounces, the average deviation* of the weights of a group 
of 1,000 elephants would doubtless be far greater than the 
average deviation of the weights of 1,000 canary birds, but this 
would not necessarily mean that the weights of elephants are 
relatively more variable than the weights of canary birds. In 
order to determine the true variability of a series it is neces- 
sary to relate the measure of dispersion to the measure of 
central tendency. This may be done by computing a
coefficient of dispersion,† a ratio which expresses the dispersion
as a proportion of the measure of central tendency.
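The point about elephants and canary birds can be made concrete with a short computation. The sketch below is only illustrative: it assumes the median as the measure of central tendency, takes differences without regard to sign, and uses two small hypothetical series (`light` and `heavy`, invented here and not data from this paper) that have the same average deviation but very different coefficients of dispersion. The function names are likewise assumptions of the sketch.

```python
from statistics import median


def average_deviation(values, center=None):
    """Mean of the absolute differences between each value and the center.

    The center defaults to the median (an assumption of this sketch;
    the mean could be used instead).
    """
    if center is None:
        center = median(values)
    return sum(abs(v - center) for v in values) / len(values)


def coefficient_of_dispersion(values):
    """Average deviation expressed as a proportion of the median."""
    center = median(values)
    return average_deviation(values, center) / center


# Hypothetical weights in pounds, invented for illustration only.
light = [90, 100, 110]
heavy = [990, 1000, 1010]

for name, series in (("light", light), ("heavy", heavy)):
    print(name, average_deviation(series), coefficient_of_dispersion(series))
```

Both hypothetical series deviate from their medians by the same number of pounds on average, yet in proportion to the median the dispersion of the lighter series is ten times that of the heavier one.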

It follows that, if a frequency diagram is to serve the purpose 
for which it is intended, it must show, with all possible clear- 
ness and effectiveness, the distribution of the individual values, 
the central tendency, and the relation of the distribution to 
the central tendency. Diagram 1 shows the distribution of 
the measures. Does it also show, with the emphasis required, 
the two other essential facts? 

*The average deviation is a statistical expression indicating the dispersion of the values of a series of
variables about their central tendency. It is obtained by adding together the differences, taken without
regard to sign, between all the individual values and the central tendency, and dividing the result by the
number of values.

†The coefficient of dispersion is the measure of dispersion (the average deviation, standard deviation, or
probable error) divided by the measure of central tendency.

On Diagram 1 the median is indicated in the usual way,
by a vertical line dividing into two equal parts the surface of
the figure enclosed by the curve and the base line. This line 
is sometimes referred to as the median line, but the designa- 
tion does violence to the principles of graphic presentation. 
In diagrams, lines or areas are, or should be, proportionate to 
the quantities they represent. The length of the so-called 
"median line" is not proportionate to the median weight of 
men; it is proportionate rather, as the class interval for the 
distribution is 20 pounds, to the approximate number of men 
whose weights fall within limits fixed, respectively, at 10 pounds 
below and at 10 pounds above the median weight. The line
represents, in other words, not the median value for the series,
but a number of cases. There is nowhere on the diagram a
line representing by its length, or a surface representing by
its area, the median weight of the men.

Diagram 2. Weights of 738 men, shown with horizontal zero.

The median can be determined, it is true, by referring to 
the scale at the foot of the figure. As the point of intersec- 
tion of the so-called "median line" with the base line falls 
at 156 pounds, as indicated by the horizontal scale, it follows 
that this value is the median, but the result is not obtained 
by the graphic method. The figures on the scale are not 
graphic representations any more than are the figures of a 
table or a text statement. 

The median can, however, be shown by the graphic method 
by so extending the base line that the horizontal scale will 
include the zero. This method has been followed in preparing
Diagram 2. In Diagram 2 the horizontal distance
from the vertical line at the left of the figure to the so-called 
"median line," measured on the base line or along any 
abscissa, represents the median weight of the men. 

If the inclusion of the horizontal zero is required for a com- 
plete graphical representation of the median, it is even more 
essential as a means of showing the relationship of the dis- 
persion to the median. As Diagram 1 contains no graphical 
representation of the central tendency, it follows that it affords 
no graphical representation of the relation between the central 
tendency and the dispersion. The dispersion of the series is 
indicated by the form of the curve and also by a line beneath 
the base line, proportionate in length to the average deviation 
(14.2 pounds), drawn to scale and extending to the left of the 
median. By including this line, the dispersion is reduced to 
a single graphical expression, but the diagram contains no 
graphical representation of the median with which either the 
line or the curve can be compared. 

An effective graphical representation of the relationship 
between the central tendency and the distribution is found in 
Diagram 2, in which the median, represented by the distance 
between the horizontal zero and the vertical "median line," 
can be compared both with the surface of frequency, as indi- 
cated by the curve, and with the line representing the average 
deviation. The ratio of the length of this line to the distance 
from the horizontal zero to the median line is equivalent to 
the coefficient of dispersion. 

The difficulties arising from the omission of the horizontal
zero are further illustrated in Diagram 3, in which the weights
of the 738 men are compared with the weights of 279 thirteen-
and fourteen-year-old school boys.*

Diagram 3. Weights of 738 men and 279 boys, shown without horizontal zeros.

*The data, which are for boys attending the Worcester, Mass., public schools, are from a report by Franz
Boas and Clark Wissler, published in the report of the U. S. Commissioner of Education for 1904.

In Diagram 3 the scales for pounds are identical in both
figures. The appearance of the diagram suggests that the
two distributions are very much alike; as the figure for men
has a greater spread at the base line than that for boys it would
seem that the former represents, if anything, the wider
dispersion. This impression is not borne out by the data. The
actual dispersion (average deviation) is, roughly, the same for
the two series: 14.2 pounds for the men and 14.3 pounds for
the boys. But as the median for the men is 156.3 pounds,
and that for the boys 90.8 pounds, computation shows that
the significant measure of relative variability, the coefficient
of dispersion, is .157 for the boys and only .091 for the men.
In other words, the dispersion of the weights of the boys is
15.7 per cent of the median weight of boys, while for the men
the dispersion of the weights is but 9.1 per cent of the median
weight of men. The apparent similarity of the two distributions
represented in Diagram 3 is, therefore, accidental and
the diagram is misleading.

It may be said that anyone using Diagram 3 could determine
the relative dispersions by a study of the figures of the
scales; that the scales show the medians, and that it is not
impossible to relate these medians to the dispersions. This
is true, but, as the same facts can be determined from a
frequency table, the argument offered is merely an argument for
not using graphical representations for comparing two or
more series of variables.

Diagram 4. Weights of 738 men and 279 boys, shown with horizontal zeros.
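Relating the medians to the dispersions, as the discussion above suggests a reader of the scales might do, takes only two divisions. The sketch below uses the medians and average deviations reported in the text; the variable and function names are assumptions made for illustration.

```python
# Medians and average deviations, in pounds, as reported in the text.
men = {"median": 156.3, "average_deviation": 14.2}
boys = {"median": 90.8, "average_deviation": 14.3}


def coefficient_of_dispersion(average_deviation, median):
    """Dispersion expressed as a proportion of the central tendency."""
    return average_deviation / median


cd_men = coefficient_of_dispersion(men["average_deviation"], men["median"])
cd_boys = coefficient_of_dispersion(boys["average_deviation"], boys["median"])

print(f"men:  {cd_men:.3f}")   # 0.091
print(f"boys: {cd_boys:.3f}")  # 0.157
```

The two average deviations differ by a tenth of a pound, but once each is divided by its own median the relative dispersion of the boys' weights comes out well above that of the men's.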

Diagram 4 shows in graphic terms the true relationship 
between the dispersions. The base lines of Figures A and B 
of this diagram have been carried out to zero, and the scales
have been so adjusted that the distance from zero to the median
is the same in both figures. It is now possible to view the dis- 
persions in their relationship to the central tendencies. The 
lines representing the average deviations, as well as the con- 
tours of the curves, show very clearly that the weights of 
boys are much more widely dispersed than the weights of men. 
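The adjustment of scales described for Diagram 4 can be sketched numerically. The example below is an illustration under stated assumptions, not the author's drafting procedure: it picks a hypothetical drawing distance of 4.0 units from zero to the median (`MEDIAN_DISTANCE`), derives the units-per-pound scale for each figure from the medians reported in the text, and then finds the length of each average-deviation line on that scale. All names are invented for the sketch.

```python
MEDIAN_DISTANCE = 4.0  # hypothetical distance on paper from zero to the median

# Medians and average deviations, in pounds, as reported in the text.
series = {
    "men": {"median": 156.3, "average_deviation": 14.2},
    "boys": {"median": 90.8, "average_deviation": 14.3},
}


def units_per_pound(median_pounds, median_distance=MEDIAN_DISTANCE):
    """Horizontal scale that places the median at median_distance from zero."""
    return median_distance / median_pounds


def deviation_line_length(figures, median_distance=MEDIAN_DISTANCE):
    """Length on paper of the average-deviation line under the adjusted scale."""
    scale = units_per_pound(figures["median"], median_distance)
    return figures["average_deviation"] * scale


lengths = {name: deviation_line_length(f) for name, f in series.items()}

for name, length in lengths.items():
    # Dividing by the zero-to-median distance recovers the coefficient
    # of dispersion, whatever drawing distance was chosen.
    ratio = length / MEDIAN_DISTANCE
    print(f"{name}: line = {length:.3f} units, ratio = {ratio:.3f}")
```

Because both medians sit at the same distance from zero, the two line lengths can be compared directly by eye, and the longer line for the boys reflects their larger coefficient of dispersion.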

The fact that in Diagram 4 the surface enclosed by the 
curve and base line of Figure B is much greater than that en- 
closed by the curve and base line of Figure A, might lead an 
incautious observer to assume that the dissimilarity in the 
appearance of the figures is due to a difference in the number 
of observations — that the number of boys exceeds the num- 
ber of men. Such an inference would be unwarranted. As 
numbers have been reduced to percentages, 100 per cent is
the total for each group. The values are plotted upon the 
ordinates; hence, the spaces between the ordinates, and the 
areas enclosed by the curves and the base lines, are without 
significance. It is believed that the diagram affords a correct 
interpretation of the data; that it gives an impression of two 
groups of which one is somewhat closely clustered about its 
central tendency, while the other is much more widely dis- 
persed. 

It should be noted that there is an important group of 
frequency diagrams to which the arguments in favor of in- 
cluding the horizontal zero, which have been stated in the 
preceding pages, do not apply. These are diagrams of dis- 
tributions in which the zero cannot be exactly located. In the 
so-called normal frequency distribution the base line and the
ends of the curve are asymptotic; the ends and the base
line are tangent only at infinity. It follows that, in plotting
probabilities, or results in the psychological field which are based
not upon concrete measurements but upon rankings, the
horizontal zero cannot be shown.

But it is also impossible to show a zero based upon data of 
this kind in any type of diagram, and this is true whether the 
zero is vertical or horizontal. If the horizontal zero can not 
be shown in a frequency diagram representing the distribution 
of school boys with reference to a given mental trait, as
determined by the rankings of competent judges, neither can a
zero be shown in a diagram in which the ability of any one of
these boys at successive tests is indicated by a historical curve. 
It is possible to present a horizontal zero in a frequency dia- 
gram for any data for which a vertical zero for an ogive curve 
can be shown. 

A practical objection to the inclusion of the horizontal zero 
is the fact that additional space is required. But this objection
is no more applicable to the horizontal zero in frequency
diagrams than to the vertical zero in line diagrams. The 
inclusion of the vertical zero in diagrams of the latter type is 
the established practice. And an inspection of the diagrams 
presented with this paper makes it clear that the inclusion of 
the horizontal zero presents no serious difficulties. A case 
will occasionally be encountered in which the dispersion con- 
stitutes so small a proportion of the central tendency that the 
zero, whether horizontal or vertical, must be omitted, but such 
cases are most exceptional.

The arguments and the illustrations presented in the pre- 
ceding pages seem to support the following conclusions: In 
frequency diagrams, where the position of the horizontal zero 
is exactly ascertainable, and where the dispersion is not too 
small in proportion to the measure of central tendency, the 
horizontal zero should be included in the diagram. This 
means that the horizontal zero should be included in a fre- 
quency diagram in all cases in which a zero for similar data 
would be included in any type of diagram. Without the 
horizontal zero the frequency diagram does not afford a com- 
plete graphical representation of the central tendency nor of 
the relationship of the central tendency to the distribution. 



